OCR exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Unsupervised person recognition in TV broadcast

Internal identifier: 000046 (France/Analysis); previous: 000045; next: 000047

Authors: Johann Poignant [France]

Source:

RBID: Hal:tel-00958774

French descriptors

English descriptors

Abstract

In this thesis we propose several methods for unsupervised person identification in TV broadcasts using the names written on the screen. Because biometric models cannot viably recognize people in large video collections without a priori knowledge of who appears in the videos, several state-of-the-art methods use other sources of information to obtain the names of those present. These methods mainly rely on pronounced names. However, this source cannot be trusted much, because of transcription and name-detection errors and because it is difficult to know to whom a pronounced name refers. Names written on the screen in TV broadcasts have not been used in the past because they were difficult to extract from low-quality video. However, recent years have seen improvements in video quality and in overlay text integration. In this thesis we therefore re-evaluated the use of this source of names. We first developed LOOV (LIG Overlaid OCR in Video), a tool that extracts overlaid text from video. With this tool we obtained a very low character error rate, which gives us high confidence in this source of names. We then compared written and pronounced names in their ability to provide the names of the persons present in TV broadcasts. We found that, with automatic extraction, twice as many persons are nameable by written names as by pronounced names. Another important point is that associating a name with a person is inherently easier for written names than for pronounced names. With this excellent source of names we were able to develop several unsupervised methods for naming people in TV broadcasts. We started with late naming methods, in which names are propagated onto speaker clusters. These methods question, in different ways, the choices made during the diarization process.
We then proposed two methods (integrated naming and early naming) that incorporate more information from written names during the diarization process. To identify people appearing on screen, we adapted the early naming method to face clusters. Finally, we showed that this method also works for multimodal speaker-face clusters. With the latter method, which names speech turns and faces in a single process, we obtain scores comparable to those of the best systems that took part in the first REPERE evaluation.
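
The idea of late naming, in which written names are propagated onto speaker clusters that diarization has already produced, can be illustrated with a minimal sketch. This is not the thesis's actual algorithm: the assignment rule (give each cluster the name whose on-screen display overlaps its speech turns the longest), the function names, and the sample timings and names are all assumptions made up for illustration.

```python
from collections import defaultdict


def overlap(a_start, a_end, b_start, b_end):
    """Duration (in seconds) shared by two time intervals, 0.0 if disjoint."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def late_naming(speech_turns, written_names):
    """Name each speaker cluster after diarization is finished.

    speech_turns:  list of (start, end, cluster_id) from a diarization system
    written_names: list of (start, end, name) from overlaid-text OCR
    Assumed rule: a cluster receives the name that co-occurs with its
    speech turns for the longest total time; otherwise it stays unnamed.
    """
    cooccurrence = defaultdict(lambda: defaultdict(float))
    clusters = set()
    for t_start, t_end, cluster in speech_turns:
        clusters.add(cluster)
        for n_start, n_end, name in written_names:
            shared = overlap(t_start, t_end, n_start, n_end)
            if shared > 0:
                cooccurrence[cluster][name] += shared

    labels = {}
    for cluster in clusters:
        names = cooccurrence.get(cluster)
        labels[cluster] = max(names, key=names.get) if names else None
    return labels


# Hypothetical data: three speaker clusters and two on-screen names.
turns = [(0.0, 10.0, "spk1"), (10.0, 18.0, "spk2"),
         (18.0, 30.0, "spk1"), (30.0, 35.0, "spk3")]
names = [(2.0, 6.0, "Alice Martin"), (11.0, 15.0, "Bob Durand"),
         (17.0, 19.0, "Bob Durand")]
labels = late_naming(turns, names)
print(labels)
```

Because the clusters are fixed before naming starts, a diarization error (two people merged into one cluster, for example) cannot be repaired at this stage, which is the limitation that the integrated and early naming methods mentioned in the abstract address by using written names during diarization.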

Url: https://tel.archives-ouvertes.fr/tel-00958774


Affiliations:


The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Unsupervised person recognition in TV broadcast</title>
<title xml:lang="fr">Identification non-supervisée de personnes dans les flux télévisés</title>
<author>
<name sortKey="Poignant, Johann" sort="Poignant, Johann" uniqKey="Poignant J" first="Johann" last="Poignant">Johann Poignant</name>
<affiliation wicri:level="1">
<hal:affiliation type="laboratory" xml:id="struct-24471" status="VALID">
<orgName>Laboratoire d'Informatique de Grenoble</orgName>
<orgName type="acronym">LIG</orgName>
<desc>
<address>
<addrLine>UMR 5217 - Laboratoire LIG - 38041 Grenoble cedex 9 - France Tél. : +33 (0)4 76 51 43 61 - Fax : +33 (0)4 76 51 49 85</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.liglab.fr/</ref>
</desc>
<listRelation>
<relation active="#struct-300275" type="direct"></relation>
<relation active="#struct-51016" type="direct"></relation>
<relation active="#struct-3886" type="direct"></relation>
<relation name="UMR5217" active="#struct-441569" type="direct"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-300275" type="direct">
<org type="institution" xml:id="struct-300275" status="OLD">
<idno type="IdRef">026388804</idno>
<orgName>Institut National Polytechnique de Grenoble </orgName>
<orgName type="acronym">INPG</orgName>
<date type="end">2006-12-31</date>
<desc>
<address>
<addrLine>46 avenue Félix Viallet 38031 Grenoble Cedex 1</addrLine>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-51016" type="direct">
<org type="institution" xml:id="struct-51016" status="OLD">
<idno type="IdRef">026404796</idno>
<orgName>Université Joseph Fourier - Grenoble 1</orgName>
<orgName type="acronym">UJF</orgName>
<date type="end">2015-12-31</date>
<desc>
<address>
<addrLine>BP 53 - 38041 Grenoble Cedex 9</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.ujf-grenoble.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-3886" type="direct">
<org type="institution" xml:id="struct-3886" status="OLD">
<idno type="IdRef">02640432X</idno>
<orgName>Université Pierre Mendès France - Grenoble 2</orgName>
<orgName type="acronym">UPMF</orgName>
<date type="end">2015-12-31</date>
<desc>
<address>
<addrLine>BP 47 - 38040 Grenoble Cedex 9</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.upmf-grenoble.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle name="UMR5217" active="#struct-441569" type="direct">
<org type="institution" xml:id="struct-441569" status="VALID">
<idno type="IdRef">02636817X</idno>
<idno type="ISNI">0000000122597504</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
<placeName>
<settlement type="city">Grenoble</settlement>
<region type="region" nuts="2">Auvergne-Rhône-Alpes</region>
<region type="old region" nuts="2">Rhône-Alpes</region>
</placeName>
<orgName type="university">Université Joseph Fourier</orgName>
<orgName type="institution" wicri:auto="newGroup">Université de Grenoble</orgName>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">HAL</idno>
<idno type="RBID">Hal:tel-00958774</idno>
<idno type="halId">tel-00958774</idno>
<idno type="halUri">https://tel.archives-ouvertes.fr/tel-00958774</idno>
<idno type="url">https://tel.archives-ouvertes.fr/tel-00958774</idno>
<date when="2013-10-18">2013-10-18</date>
<idno type="wicri:Area/Hal/Corpus">000128</idno>
<idno type="wicri:Area/Hal/Curation">000128</idno>
<idno type="wicri:Area/Hal/Checkpoint">000041</idno>
<idno type="wicri:Area/Main/Merge">000143</idno>
<idno type="wicri:Area/Main/Curation">000142</idno>
<idno type="wicri:Area/Main/Exploration">000142</idno>
<idno type="wicri:Area/France/Extraction">000046</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en">Unsupervised person recognition in TV broadcast</title>
<title xml:lang="fr">Identification non-supervisée de personnes dans les flux télévisés</title>
<author>
<name sortKey="Poignant, Johann" sort="Poignant, Johann" uniqKey="Poignant J" first="Johann" last="Poignant">Johann Poignant</name>
<affiliation wicri:level="1">
<hal:affiliation type="laboratory" xml:id="struct-24471" status="VALID">
<orgName>Laboratoire d'Informatique de Grenoble</orgName>
<orgName type="acronym">LIG</orgName>
<desc>
<address>
<addrLine>UMR 5217 - Laboratoire LIG - 38041 Grenoble cedex 9 - France Tél. : +33 (0)4 76 51 43 61 - Fax : +33 (0)4 76 51 49 85</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.liglab.fr/</ref>
</desc>
<listRelation>
<relation active="#struct-300275" type="direct"></relation>
<relation active="#struct-51016" type="direct"></relation>
<relation active="#struct-3886" type="direct"></relation>
<relation name="UMR5217" active="#struct-441569" type="direct"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-300275" type="direct">
<org type="institution" xml:id="struct-300275" status="OLD">
<idno type="IdRef">026388804</idno>
<orgName>Institut National Polytechnique de Grenoble </orgName>
<orgName type="acronym">INPG</orgName>
<date type="end">2006-12-31</date>
<desc>
<address>
<addrLine>46 avenue Félix Viallet 38031 Grenoble Cedex 1</addrLine>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-51016" type="direct">
<org type="institution" xml:id="struct-51016" status="OLD">
<idno type="IdRef">026404796</idno>
<orgName>Université Joseph Fourier - Grenoble 1</orgName>
<orgName type="acronym">UJF</orgName>
<date type="end">2015-12-31</date>
<desc>
<address>
<addrLine>BP 53 - 38041 Grenoble Cedex 9</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.ujf-grenoble.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-3886" type="direct">
<org type="institution" xml:id="struct-3886" status="OLD">
<idno type="IdRef">02640432X</idno>
<orgName>Université Pierre Mendès France - Grenoble 2</orgName>
<orgName type="acronym">UPMF</orgName>
<date type="end">2015-12-31</date>
<desc>
<address>
<addrLine>BP 47 - 38040 Grenoble Cedex 9</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.upmf-grenoble.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle name="UMR5217" active="#struct-441569" type="direct">
<org type="institution" xml:id="struct-441569" status="VALID">
<idno type="IdRef">02636817X</idno>
<idno type="ISNI">0000000122597504</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
<placeName>
<settlement type="city">Grenoble</settlement>
<region type="region" nuts="2">Auvergne-Rhône-Alpes</region>
<region type="old region" nuts="2">Rhône-Alpes</region>
</placeName>
<orgName type="university">Université Joseph Fourier</orgName>
<orgName type="institution" wicri:auto="newGroup">Université de Grenoble</orgName>
</affiliation>
</author>
</analytic>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="mix" xml:lang="en">
<term>Information retrieval in multimedia documents</term>
<term>Multimodal fusion</term>
<term>Person recognition</term>
<term>Video OCR</term>
</keywords>
<keywords scheme="mix" xml:lang="fr">
<term>Fusion multimodale</term>
<term>OCR dans les vidéos</term>
<term>Recherche d'information dans les documents multimedias</term>
<term>Reconnaissance de personnes</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">In this thesis we propose several methods for unsupervised person identification in TV broadcasts using the names written on the screen. Because biometric models cannot viably recognize people in large video collections without a priori knowledge of who appears in the videos, several state-of-the-art methods use other sources of information to obtain the names of those present. These methods mainly rely on pronounced names. However, this source cannot be trusted much, because of transcription and name-detection errors and because it is difficult to know to whom a pronounced name refers. Names written on the screen in TV broadcasts have not been used in the past because they were difficult to extract from low-quality video. However, recent years have seen improvements in video quality and in overlay text integration. In this thesis we therefore re-evaluated the use of this source of names. We first developed LOOV (LIG Overlaid OCR in Video), a tool that extracts overlaid text from video. With this tool we obtained a very low character error rate, which gives us high confidence in this source of names. We then compared written and pronounced names in their ability to provide the names of the persons present in TV broadcasts. We found that, with automatic extraction, twice as many persons are nameable by written names as by pronounced names. Another important point is that associating a name with a person is inherently easier for written names than for pronounced names. With this excellent source of names we were able to develop several unsupervised methods for naming people in TV broadcasts. We started with late naming methods, in which names are propagated onto speaker clusters. These methods question, in different ways, the choices made during the diarization process.
We then proposed two methods (integrated naming and early naming) that incorporate more information from written names during the diarization process. To identify people appearing on screen, we adapted the early naming method to face clusters. Finally, we showed that this method also works for multimodal speaker-face clusters. With the latter method, which names speech turns and faces in a single process, we obtain scores comparable to those of the best systems that took part in the first REPERE evaluation.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>France</li>
</country>
<region>
<li>Auvergne-Rhône-Alpes</li>
<li>Rhône-Alpes</li>
</region>
<settlement>
<li>Grenoble</li>
</settlement>
<orgName>
<li>Université Joseph Fourier</li>
<li>Université de Grenoble</li>
</orgName>
</list>
<tree>
<country name="France">
<region name="Auvergne-Rhône-Alpes">
<name sortKey="Poignant, Johann" sort="Poignant, Johann" uniqKey="Poignant J" first="Johann" last="Poignant">Johann Poignant</name>
</region>
</country>
</tree>
</affiliations>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/France/Analysis
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000046 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/France/Analysis/biblio.hfd -nk 000046 | SxmlIndent | more
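
If the output of these commands is saved to a file instead of being paged, the record can also be read with Python's standard library. The sketch below is only an illustration under stated assumptions: it expects a bare `<record>` element shaped like the one shown on this page (no XML declaration), and because that record uses the `wicri:` and `hal:` prefixes without declaring them, it wraps the text in a root element with placeholder namespace URIs. Those URIs, the `read_record` function and the abbreviated sample are hypothetical, not part of Dilib.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URIs (assumptions): the record on this page does not
# declare the wicri: and hal: prefixes, so a strict parser would reject it.
WRAPPER_OPEN = ('<wrapper xmlns:wicri="urn:example:wicri" '
                'xmlns:hal="urn:example:hal">')
WRAPPER_CLOSE = "</wrapper>"

# ElementTree expands the predefined xml: prefix to this namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def read_record(record_xml):
    """Return the titles (by language), HAL id and HAL URI of one record."""
    root = ET.fromstring(WRAPPER_OPEN + record_xml + WRAPPER_CLOSE)
    record = root.find("record")
    titles = {t.get(XML_LANG): t.text
              for t in record.iterfind(".//titleStmt/title")}
    idnos = {i.get("type"): i.text
             for i in record.iterfind(".//publicationStmt/idno")}
    return {"titles": titles,
            "hal_id": idnos.get("halId"),
            "url": idnos.get("halUri")}


# Abbreviated copy of the record above, kept short for the example.
sample = """<record><TEI><teiHeader><fileDesc>
<titleStmt>
<title xml:lang="en">Unsupervised person recognition in TV broadcast</title>
<title xml:lang="fr">Identification non-supervisée de personnes dans les flux télévisés</title>
<author><affiliation wicri:level="1"><country>France</country></affiliation></author>
</titleStmt>
<publicationStmt>
<idno type="halId">tel-00958774</idno>
<idno type="halUri">https://tel.archives-ouvertes.fr/tel-00958774</idno>
</publicationStmt>
</fileDesc></teiHeader></TEI></record>"""

info = read_record(sample)
print(info["hal_id"], info["titles"]["en"])
```

Wrapping the text rather than stripping the prefixes keeps attributes such as `wicri:level` available to later queries, at the cost of depending on namespace URIs that are invented here.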

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    France
   |étape=   Analysis
   |type=    RBID
   |clé=     Hal:tel-00958774
   |texte=   Unsupervised person recognition in TV broadcast
}}

Wicri

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024